Introduction

Some information about the project would be really nice here.

Analysis

2.1a Fraction of Xenologs vs. Number of Genes

Q: Is there a dependence on the size of the gene tree, i.e., the number of species and genes?

Short description of the task at hand.

Tab. 1: Dependence of the fraction of xenologs on the number of genes. For each group, the slope and intercept of a linear model were calculated, as well as the Spearman correlation coefficient.
Group Duplication_Rate Loss_Rate HGT_Rate Slope Intercept Spearman_Corr
P0 0.25 0.25 0.25 0.0014 0.23 0.21
P1 0.50 0.50 0.50 -0.0009 0.39 0.05
P2 0.50 0.50 1.00 -0.0012 0.48 -0.02
P3 0.50 0.50 1.50 -0.0021 0.58 -0.16
P4 1.00 1.00 0.50 -0.0011 0.40 0.04
P5 1.00 1.00 1.00 -0.0016 0.53 -0.14
P6 1.50 1.50 1.50 -0.0016 0.57 -0.24
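
The per-group statistics reported in a table like this could be computed along the following lines. This is a minimal sketch under stated assumptions, not the project's actual script: the toy data frame and the column names `Group` and `Number_of_Genes` are hypothetical and were chosen for illustration only.

```r
# Toy data standing in for the real simulation results (hypothetical values).
set.seed(1)
toyDf <- data.frame(
  Group = rep(c("P0", "P1"), each = 50),
  Number_of_Genes = sample(10:200, 100, replace = TRUE)
)
toyDf$Fraction_of_Xenologs <- runif(100, min = 0, max = 1)

# Fit a linear model and compute Spearman's rho separately for each group.
group_stats <- function(df) {
  per_group <- lapply(split(df, df$Group), function(g) {
    fit <- lm(Fraction_of_Xenologs ~ Number_of_Genes, data = g)
    data.frame(
      Group = g$Group[1],
      Slope = unname(coef(fit)["Number_of_Genes"]),
      Intercept = unname(coef(fit)["(Intercept)"]),
      Spearman_Corr = cor(g$Number_of_Genes, g$Fraction_of_Xenologs,
                          method = "spearman")
    )
  })
  do.call(rbind, per_group)
}

stats <- group_stats(toyDf)
print(stats)
```

Replacing `Number_of_Genes` with a species count column would give the corresponding statistics for the number of species.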

2.1a Plots: Fraction of Xenologs vs. Number of Genes

Each plot needs a short description. This can be done here. Maybe it is better not to iterate over the groups. Maybe we should split the following code into 7 separate sections, so we can write a custom text for each. But I guess an overall description is fine as well... less work ;)

Some text here. . .

2.1a Fraction of Xenologs vs. Number of Species

Each plot needs a short description. This can be done here. Maybe it is better not to iterate over the groups. Maybe we should split the following code into 7 separate sections, so we can write a custom text for each.

Tab. 2: Dependence of the fraction of xenologs on the number of species. For each group, the slope and intercept of a linear model were calculated, as well as the Spearman correlation coefficient.
Group Duplication_Rate Loss_Rate HGT_Rate Slope Intercept Spearman_Corr
P0 0.25 0.25 0.25 0.0015 0.23 0.09
P1 0.50 0.50 0.50 0.0004 0.34 0.04
P2 0.50 0.50 1.00 0.0004 0.42 0.02
P3 0.50 0.50 1.50 -0.0022 0.55 -0.09
P4 1.00 1.00 0.50 0.0002 0.34 0.02
P5 1.00 1.00 1.00 -0.0016 0.49 -0.04
P6 1.50 1.50 1.50 0.0001 0.46 0.00

2.1a Plots: Fraction of Xenologs vs. Number of Species

Each plot needs a short description. This can be done here. Maybe it is better not to iterate over the groups. Maybe we should split the following code into 7 separate sections, so we can write a custom text for each. But I guess an overall description is fine as well... less work ;)

Some text here. . .

2.1b Fraction of Xenologs with fixed HGT

How does the fraction depend on the rate of duplications and losses for a fixed horizontal transfer rate?

Some text on what was done here, and maybe why.

Plot: Duplication Rate

Short introduction

Boxplot of the fraction of xenologs plotted against the duplication rate with a fixed horizontal gene transfer (HGT) rate. The different colors mark the groups with the same HGT rate.

In XX we see examples of plotting in R. Explanation of the graph
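
A grouped boxplot of this kind could be produced with base R as sketched below. This is an illustration under stated assumptions, not the code behind the figure: the toy data frame is hypothetical, and the column names `Duplication_Rate` and `HGT_Rate` are borrowed from the table headers above.

```r
# Toy data: 20 hypothetical trees per (duplication rate, HGT rate) combination.
set.seed(7)
toyBoxDf <- expand.grid(Duplication_Rate = c(0.5, 1.0),
                        HGT_Rate = c(0.5, 1.0),
                        replicate = 1:20)
toyBoxDf$Fraction_of_Xenologs <- runif(nrow(toyBoxDf), min = 0, max = 1)

# One colour per HGT rate; within the formula the first factor varies fastest,
# so each colour is repeated once per duplication rate.
hgt_colors <- c("steelblue", "tomato")
n_dup <- length(unique(toyBoxDf$Duplication_Rate))

box <- boxplot(Fraction_of_Xenologs ~ Duplication_Rate + HGT_Rate,
               data = toyBoxDf,
               col = rep(hgt_colors, each = n_dup),
               xlab = "Duplication rate . HGT rate",
               ylab = "Fraction of Xenologs")
legend("topright", title = "HGT rate",
       legend = sort(unique(toyBoxDf$HGT_Rate)), fill = hgt_colors)
```

Swapping `Duplication_Rate` for a loss rate column gives the analogous plot for the loss rate.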

Plot: Loss Rate

Short introduction

Boxplot of the fraction of xenologs plotted against the loss rate with a fixed horizontal gene transfer (HGT) rate. The different colors mark the groups with the same HGT rate.

In XX we see examples of plotting in R. Explanation of the graph

2.1c

How does the fraction depend on the horizontal transfer rate and on the rate of duplications and losses?

How does the fraction depend on the horizontal transfer rate with a fixed duplication and loss rate? (Is this the corrected question?)

2.1c Plots: Fraction vs. HGT with fixed Loss

THE PLOTS ARE NOT SATISFACTORY. PAUL, DO YOU HAVE A GOOD IDEA FOR HOW WE COULD PRESENT THIS WELL?

See lines 205-258 in tree_analyze.RScript, under the heading "To-Do", in case anyone wants to experiment there.

Boxplot of the fraction of xenologs plotted against the loss rate with a fixed horizontal gene transfer (HGT) rate. The different colors mark the groups with the same HGT rate.

2.1c Plots: Fraction vs. HGT with fixed Duplication

THE PLOTS ARE NOT SATISFACTORY. PAUL, DO YOU HAVE A GOOD IDEA FOR HOW WE COULD PRESENT THIS WELL?

Boxplot of the fraction of xenologs plotted against the loss rate with a fixed horizontal gene transfer (HGT) rate. The different colors mark the groups with the same HGT rate.

2.1d

How does the fraction depend on the frequency of multifurcations?

2.1d Plot

HERE, TOO, I DO NOT LIKE THE PLOT AT ALL. A BOXPLOT OVER ALL GROUPS? OR 6 INDIVIDUAL PLOTS, ONE PER GROUP, WITH A LINEAR OR x^2 MODEL? OR WE FORM "BUCKETS" WITH x = non_binary_prob VS. y = Fraction_of_Xenologs, using buckets 1 = [0-0.1], 2 = [0.1-0.2], 3 = [0.2-0.3], and so on.

caption = "What can be seen here??"

# Scatter plot of Fraction_of_Xenologs against non_binary_prob
plot(x = treeDataDf$non_binary_prob,
     y = treeDataDf$Fraction_of_Xenologs)
What can be seen here??
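
The bucket idea from the note above could look roughly like this. It is only a sketch for experimenting: the toy data frame stands in for `treeDataDf` with hypothetical values, and the bucket width of 0.1 over the range 0 to 1 is taken from the note, not from the script.

```r
# Toy data standing in for treeDataDf (hypothetical values).
set.seed(42)
toyBucketDf <- data.frame(non_binary_prob = runif(200, min = 0, max = 1))
toyBucketDf$Fraction_of_Xenologs <- runif(200, min = 0, max = 1)

# Assign each tree to one of ten buckets of width 0.1 covering [0, 1].
# Intervals are right-closed; include.lowest keeps the value 0 in bucket 1.
toyBucketDf$bucket <- cut(toyBucketDf$non_binary_prob,
                          breaks = seq(0, 1, by = 0.1),
                          include.lowest = TRUE)

# One box per bucket.
boxplot(Fraction_of_Xenologs ~ bucket, data = toyBucketDf,
        xlab = "non_binary_prob (bucketed)",
        ylab = "Fraction of Xenologs",
        las = 2)
```

Compared with the raw scatter plot, the boxes make it easier to see whether the median fraction shifts across buckets; the same call could be run once per group if separate plots are preferred.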

2.2 Fitch from LDT with CD

Second, we consider the dependencies for the edges in Fitch graphs computed from an LDT graph. The following variants should be considered:

  • The complete multipartite graph obtained by solving the Cluster Deletion Problem for the complement of the LDT graph (see webpage).
  • The rs-Fitch graph of the scenario computed with “Algorithm 1” from Rbelow.pdf (the latter is already implemented in AsymmeTree).
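
For the first variant, it may help to recall that a graph is complete multipartite exactly when its complement is a disjoint union of cliques (a cluster graph). The sketch below only checks this property for a given adjacency matrix; it does not solve the Cluster Deletion Problem and is not part of AsymmeTree. The function names and the example graphs are hypothetical.

```r
# adj: symmetric 0/1 adjacency matrix without self-loops.
# A graph is a cluster graph iff adjacency plus self-loops is transitive,
# i.e. any two vertices joined by a path of length <= 2 are already adjacent.
is_cluster_graph <- function(adj) {
  reach <- adj + diag(nrow(adj))
  two_step <- (reach %*% reach) > 0
  all(two_step == (reach > 0))
}

# Complete multipartite iff the complement graph is a cluster graph.
is_complete_multipartite <- function(adj) {
  complement <- 1 - adj
  diag(complement) <- 0
  is_cluster_graph(complement)
}

# Path a - b - c: complete bipartite with parts {a, c} and {b}.
path3 <- matrix(c(0, 1, 0,
                  1, 0, 1,
                  0, 1, 0), nrow = 3, byrow = TRUE)

# Single edge a - b plus isolated vertex c: not complete multipartite.
edge_plus_isolated <- matrix(c(0, 1, 0,
                               1, 0, 0,
                               0, 0, 0), nrow = 3, byrow = TRUE)

print(is_complete_multipartite(path3))
print(is_complete_multipartite(edge_plus_isolated))
```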

@Paul, didn't you already start something here?

I can't find anything on this in my script.

Plots

They can be inserted here.

3. Triples: Characterization of LDT Graph

The triple set \(T(G)\) is related to the gene tree, while the triple set \(S(G, σ)\) is related to the species tree. It is therefore of interest to compare to what extent \(T(G)\) and \(S(G, σ)\) overlap with the triple set of the true gene tree and the triple set of the true species tree, respectively. How can this be quantified in a meaningful way? Again, we are interested in the dependence on the simulation parameters.
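
One possible way to quantify the overlap is to treat the inferred triples as predictions and the triples of the true tree as the reference, and to report precision and recall. The sketch below illustrates this under stated assumptions: the string encoding of a triple, the function names, and the example leaves are hypothetical and not taken from the project's code.

```r
# Encode the triple xy|z as a string; the order of x and y does not matter.
encode_triple <- function(x, y, z) {
  pair <- sort(c(x, y))
  paste0(pair[1], ",", pair[2], "|", z)
}

# Precision and recall of an inferred triple set against a reference set.
triple_scores <- function(inferred, reference) {
  inferred <- unique(inferred)
  reference <- unique(reference)
  tp <- length(intersect(inferred, reference))
  c(precision = if (length(inferred) > 0) tp / length(inferred) else NA_real_,
    recall    = if (length(reference) > 0) tp / length(reference) else NA_real_)
}

# Reference: all triples displayed by the tree (((x,y),w),z).
reference <- c(encode_triple("y", "x", "z"),
               encode_triple("w", "y", "z"),
               encode_triple("x", "w", "z"),
               encode_triple("x", "y", "w"))

# Inferred: two correct triples and one that contradicts the reference tree.
inferred <- c(encode_triple("x", "y", "z"),
              encode_triple("y", "w", "z"),
              encode_triple("x", "z", "w"))

scores <- triple_scores(inferred, reference)
print(scores)
```

In this toy case two of the three inferred triples are correct and two of the four reference triples are recovered. Averaging such scores per parameter group would show how they depend on the simulation parameters.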

Tab. 3: Mean recall (cd) for each group.
Group recall_cd_mean_100 recall_cd_mean_80 recall_cd_mean_60 recall_cd_mean_40 recall_cd_mean_20
P0 0.6724 0.6575 0.6461 0.6145 0.5020
P1 0.6719 0.6642 0.6429 0.5901 0.5098
P2 0.6718 0.6601 0.6577 0.6336 0.5622
P3 0.6997 0.6978 0.6805 0.6469 0.5833
P4 0.6431 0.6323 0.6316 0.5836 0.4896
P5 0.6947 0.6852 0.6762 0.6451 0.5597
P6 0.6945 0.6843 0.6676 0.6375 0.6117

Tab. 4: Mean precision (cd) for each group.
Group precision_cd_mean_100 precision_cd_mean_80 precision_cd_mean_60 precision_cd_mean_40 precision_cd_mean_20
P0 0.8503 0.8640 0.8866 0.8824 0.8665
P1 0.8811 0.8814 0.9080 0.9136 0.9040
P2 0.9143 0.9230 0.9303 0.9315 0.9418
P3 0.9385 0.9481 0.9476 0.9551 0.9566
P4 0.8618 0.8714 0.8803 0.8765 0.8940
P5 0.9270 0.9311 0.9388 0.9408 0.9347
P6 0.9456 0.9501 0.9469 0.9483 0.9659

Tab. 5: Mean accuracy (cd) for each group.
Group accuracy_cd_mean_100 accuracy_cd_mean_80 accuracy_cd_mean_60 accuracy_cd_mean_40 accuracy_cd_mean_20
P0 0.9657 0.9754 0.9846 0.9919 0.9978
P1 0.9477 0.9627 0.9761 0.9871 0.9966
P2 0.9290 0.9486 0.9660 0.9825 0.9952
P3 0.9235 0.9442 0.9625 0.9804 0.9946
P4 0.9409 0.9588 0.9731 0.9855 0.9962
P5 0.9315 0.9486 0.9667 0.9830 0.9952
P6 0.9188 0.9426 0.9603 0.9791 0.9944

Tab. 6: Mean recall (rs) for each group.
Group recall_rs_mean_100 recall_rs_mean_80 recall_rs_mean_60 recall_rs_mean_40 recall_rs_mean_20
P0 0.7007 0.6720 0.6476 0.6232 0.5057
P1 0.7103 0.6907 0.6458 0.5949 0.5267
P2 0.7348 0.6980 0.6855 0.6568 0.5767
P3 0.7542 0.7400 0.7177 0.6720 0.5980
P4 0.7013 0.6612 0.6635 0.6006 0.5150
P5 0.7441 0.7275 0.7083 0.6742 0.5848
P6 0.7550 0.7350 0.7170 0.6751 0.6426

Tab. 7: Mean precision (rs) for each group.
Group precision_rs_mean_100 precision_rs_mean_80 precision_rs_mean_60 precision_rs_mean_40 precision_rs_mean_20
P0 0.8290 0.8337 0.8432 0.8627 0.8519
P1 0.8492 0.8491 0.8618 0.8750 0.8934
P2 0.9069 0.8925 0.9107 0.9107 0.9266
P3 0.9232 0.9313 0.9335 0.9407 0.9490
P4 0.8580 0.8405 0.8565 0.8431 0.8991
P5 0.9079 0.9110 0.9138 0.9179 0.9204
P6 0.9361 0.9318 0.9368 0.9350 0.9541

Tab. 8: Mean accuracy (rs) for each group.
Group accuracy_rs_mean_100 accuracy_rs_mean_80 accuracy_rs_mean_60 accuracy_rs_mean_40 accuracy_rs_mean_20
P0 0.9663 0.9746 0.9838 0.9915 0.9978
P1 0.9492 0.9623 0.9742 0.9866 0.9966
P2 0.9387 0.9500 0.9676 0.9829 0.9952
P3 0.9342 0.9494 0.9654 0.9812 0.9947
P4 0.9454 0.9592 0.9734 0.9853 0.9964
P5 0.9381 0.9521 0.9683 0.9835 0.9953
P6 0.9301 0.9487 0.9646 0.9807 0.9948